ABSTRACT



A Turing machine multiplies binary integers on-line if it
receives its inputs low-order digits first and produces the
$j$-th digit of the product before reading in the $(j+1)$-st digits
of the two inputs. We present a general method for converting
any off-line multiplication algorithm which forms the product
of two n-digit binary numbers in time F(n) into an on-line method
which uses time only $O(F(n) \log n)$, assuming that F is monotone
and satisfies $n \le F(n) \le F(2n)/2 \le kF(n)$ for some constant k. Applying
this technique to the fast multiplication algorithm of Schönhage
and Strassen gives an upper bound of $O(n (\log n)^2 \log\log n)$ for
on-line multiplication of integers. A refinement of the technique
yields an optimal method for on-line multiplication by certain
sparse integers. Other applications are to the on-line computation
of products of polynomials, recognition of palindromes, and
multiplication by a constant.

1. Introduction

The problem of finding the product of two integers expressed, say, 
in binary notation is basic to the study of computation, yet the number of 
computational steps required is still not well understood. The classical 
algorithm taught in school uses a number of steps that grows as 
$n^2$ to multiply n-digit numbers. More sophisticated algorithms have
considerably reduced this rate of growth; the best algorithm known, due
to Schönhage and Strassen [8], requires only $O(n \log n \log\log n)$ computational
steps. This bound can be achieved even on a multitape Turing
machine, the model we investigate in this paper. However, no interesting 
lower bounds are known. It is not even known if the number of steps must 
grow faster than linearly in the length of the input. 

The on-line restriction constrains the manner in which a computation 
is carried out. Informally, a function is computed on-line if the input 
symbols are read sequentially, the output is produced sequentially, and
the machine produces the $j$-th output symbol before reading the $(j+1)$-st input
symbol(s). This is in contrast to off-line algorithms in which the machine
has access to all input symbols before any output need be produced. The 
on-line model is a natural paradigm for interactive computing and process 
control where the future inputs depend in an unpredictable way on the 
current outputs. Our interest in the on-line restriction stems partially 
from such practical considerations and partially from the mathematical 
tractability of on-line computation which enables some nontrivial lower 
bounds to be obtained.

The particular on-line problem of integer multiplication is motivated
by a deep result due to Cook and Aanderaa [1] and strengthened by Paterson,
M. Fischer and Meyer [7] which says that for any of a broad class of machine
models, a machine M that performs on-line integer multiplication of n-digit
numbers requires more than $cn \log n/\log\log n$ steps for all sufficiently
large n, where $c > 0$ is a constant depending only on M. In the case of
a one-dimensional multitape Turing machine, the bound becomes $cn \log n$.
Unfortunately, the methods in [1] and [7] do not apply if the computation
is done off-line.

It is not immediately obvious that on-line multiplication is even 
possible, for the usual multiplication algorithms do not obey the on-line 
restriction. The reader can easily convince himself that on-line multiplication
can in fact be done, but the best on-line algorithm previously known
requires time $O(n^2)$, leaving a considerable gap between the upper and lower
bounds. The results given here significantly reduce this gap.
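
To make the possibility of on-line multiplication concrete, the following Python sketch shows one quadratic method: for each new pair of input digits it sums one column of the schoolbook product and propagates a carry. This is an illustration only and is not claimed to be the previously known algorithm the text refers to; the function name naive_online_multiply is hypothetical.

```python
def naive_online_multiply(a_bits, b_bits):
    """Yield product digits low-order first, reading exactly one digit of
    each input per output digit. Digit j costs O(j) work, O(n^2) in total."""
    a, b = [], []          # digits read so far from each input
    carry = 0
    for aj, bj in zip(a_bits, b_bits):
        a.append(aj)
        b.append(bj)
        j = len(a) - 1
        # Column j of the schoolbook product: all a[i] * b[j - i], plus carry.
        column = carry + sum(a[i] * b[j - i] for i in range(j + 1))
        carry, digit = divmod(column, 2)
        yield digit        # produced before digit j + 1 of either input is read
```

Because the function is a generator driven by zip, digit j is yielded before the (j+1)-st digits are requested from the inputs, which is the on-line restriction in miniature.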

Our main result is given as an off-line to on-line conversion method. 
That is, given a Turing machine OFF which computes integer multiplication 
off-line, we describe a Turing machine ONLINE which computes integer multiplication
on-line using OFF as a subroutine. We will henceforth assume
that all Turing machines mentioned have one-dimensional tapes. 

Theorem 1. Let OFF be a multitape Turing machine which performs off-line
integer multiplication. Let F(n) bound the time required by OFF to
multiply n-digit integers. Assume F is monotone and
$n \le F(n) \le F(2n)/2 \le kF(n)$ for some constant k > 1. Then there is a multitape Turing machine
ONLINE which performs integer multiplication, obeys the on-line restriction,
and produces the $n$-th output digit in $O(F(n) \log n)$ computational steps.

We remark that the constant coefficient implicit in the "O"-notation
can be made arbitrarily small using linear speedup for Turing machines [4]
with a consequent increase in the sizes of the tape alphabets and finite
control.

The following corollary is immediate using the method of Schönhage and
Strassen referred to earlier.

Corollary 2. There is a multitape Turing machine which performs on-line
integer multiplication and produces the $n$-th output digit in
$O(n (\log n)^2 \log\log n)$ steps.

There are two senses in which Theorem 1 can be said to be optimal. 
First of all, the $cn \log n$ lower bound of [7] for on-line multiplication
by multitape Turing machines holds even if the machine is augmented with
a special multiply instruction which performs off-line integer multiplication
in real time. A real time computation produces one output digit at
each step and hence multiplies n-digit integers within time 2n. The proof
of Theorem 1 extends to these machines, resulting in a time $O(n \log n)$
method. Thus, Theorem 1 is the best possible result for a general off-line
to on-line multiplication conversion method on these augmented Turing
machines and strongly suggests the same for ordinary multitape Turing machines 
as well. 

Another indication of the optimality of our basic conversion method
is shown in Section 3 where we obtain a time $O(n \log n)$ on-line algorithm
for multiplying an arbitrary integer x by a particular number $K_n$, where
n is the length of x. Time $cn \log n$ is required for this problem as well [7].

The definition of on-line computation is given in [1] and [5], and we 
repeat it here, defining also the concept of half-line computation which 
simplifies the description of the conversion method. Informally, a machine
computes a two argument function half-line if one entire argument is
available off-line and the other is read subject to the on-line restriction.

Definition. Let M be a machine which computes a function f on sequences,
where $f: \Sigma^* \times \Sigma^* \to \Delta^*$, and $\Sigma$ and $\Delta$ are sets. M is said to compute f on-line if
for all input sequences $a = a_0 a_1 \ldots a_{n-1}$, $b = b_0 b_1 \ldots b_{n-1}$ ($a_i, b_i \in \Sigma$) and
corresponding outputs $f(a, b) = c_0 c_1 \ldots c_{n-1}$ ($c_i \in \Delta$), M produces $c_k$ before
reading either of $a_{k+1}$ or $b_{k+1}$, $0 \le k < n-1$. We assume here also that the
inputs are read in sequence, so for all k, $a_k$ ($b_k$) is read before $a_{k+1}$ ($b_{k+1}$).

M computes f half-line (with respect to the first argument) if M
produces $c_k$ before reading $a_{k+1}$, $0 \le k < n-1$. a will be referred to as the
on-line argument and b as the off-line argument of the half-line product.

2. The Off-Line to On-Line Conversion

2.1. Informal Description.

We first give a general description of an on-line multiplication algorithm 
ONLINE, independent of the particular machine model on which it is to be programmed,
and omitting many of the bookkeeping and data management details.
Section 2.2 gives the construction in detail. 

The definition of ONLINE is in terms of two auxiliary procedures ON and 
HALF, each of which uses a given off-line multiplication procedure OFF. 

ON(n) assumes that the first n/2 digits of the product of two n-digit
integer inputs have already been computed by prior calls to ON. It produces
the next n/2 digits on-line as the corresponding inputs are being read. It
then computes and stores the n high-order digits of the product in preparation
for a subsequent call to ON(2n).

HALF(n) forms the product of two n-digit integer inputs, producing the 
n low-order digits of the product half-line as the on-line argument is being 
read. It then computes and stores the additional n high-order digits. 

Let a, b be two n-digit binary* integers, where n is a power of 2, and write
$a = a_1 \cdot 2^{n/2} + a_0$, $b = b_1 \cdot 2^{n/2} + b_0$, where $a_1, a_0, b_1, b_0$ are n/2-digit.
Suppose $ab = c$ and $c = c_2 \cdot 2^n + c_1 \cdot 2^{n/2} + c_0$, where $c_0$, $c_1$ are n/2-digit and
$c_2$ is n-digit.

1. $c_0$ is the n/2 low-order digits of $a_0 \cdot b_0$. Let $d_1$ denote the n/2
high-order digits of this product.

2. $c_1$ is the n/2 low-order digits of $d_1 + a_0 \cdot b_1 + a_1 \cdot b_0$. Let $d_2$
denote the (roughly) n/2 high-order digits of this expression.

3. $c_2 = d_2 + a_1 \cdot b_1$.

These definitions are illustrated in Figure 1. 
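
The three-step decomposition above can be checked numerically. The Python sketch below follows steps 1 to 3 literally, using built-in integer arithmetic in place of any particular multiplication procedure; the function name decompose_product is hypothetical.

```python
def decompose_product(a, b, n):
    """Return (c2, c1, c0) with a*b == c2*2**n + c1*2**(n//2) + c0,
    computed from the four half-size products as in steps 1-3."""
    h = n // 2
    mask = (1 << h) - 1
    a1, a0 = a >> h, a & mask
    b1, b0 = b >> h, b & mask
    low = a0 * b0
    c0, d1 = low & mask, low >> h        # step 1: low digits and carry d1
    mid = d1 + a0 * b1 + a1 * b0
    c1, d2 = mid & mask, mid >> h        # step 2: middle digits and carry d2
    c2 = d2 + a1 * b1                    # step 3: high-order part
    return c2, c1, c0

print(decompose_product(13, 11, 4))      # (8, 3, 3): 8*16 + 3*4 + 3 == 143
```

Note that $c_0$ depends only on $a_0$ and $b_0$, and $c_1$ needs $a_1$ and $b_1$ only through the two mixed products, which is what makes the on-line schedule possible.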

All that need be done is to perform the four n/2-digit multiplications
$a_i \cdot b_j$ and the indicated additions in such a way as not to violate
the on-line or half-line restrictions. Both procedures are of the same
general form, differing only in input-output arrangements.
To ON(n):

1. Compute $d_1 + a_0 \cdot b_1 + a_1 \cdot b_0$ as $a_1$ and $b_1$ are being read by running
two HALF(n/2) procedures in parallel, performing additions as the
output digits are produced. $a_1$, $b_1$ are the on-line arguments and
$a_0$, $b_0$ the off-line arguments of these two half-line products.

2. After $a_1$ and $b_1$ have been read, compute $d_2 + a_1 \cdot b_1$ by one use of
OFF and additions, and store the result.



* We assume base 2 numbers for convenience. Everything generalizes
trivially to an arbitrary base.

Suppose now that a and b are to be multiplied half-line. Let a be the
on-line and b the off-line argument. Assume the length of b is $\le n$.

To HALF(n):

If n = 1,

1. Read a and print the digit ab.
else

2. Compute $a_0 \cdot b_0$ half-line by HALF(n/2) as $a_0$ is being read.

3. After $a_0$ has been read, compute $a_0 \cdot b_1$ off-line by using OFF,
and store these digits.

4. Compute $a_1 \cdot b_0$ half-line by HALF(n/2) as the digits of $a_1$ are
read, forming the digits of $d_1 + a_0 \cdot b_1 + a_1 \cdot b_0$ as the outputs
are produced.

5. Compute $d_2 + a_1 \cdot b_1$ off-line by using OFF and additions.

The top-level procedure ONLINE multiplies two numbers on-line without 
needing to know their lengths in advance. It operates by calling ON(n) with 
n equal to successive powers of 2. 

To ONLINE: 

1. Read the first digit from each input and compute the first product 
digit. 

2. n ← 2.

3. ON(n).

4. n ← 2n.

5. Go to 3. 

Let F(n) denote the number of steps required by OFF to multiply n-digit
numbers, and assume F satisfies the conditions of Theorem 1. Let N(n) and
H(n) denote the number of steps required by ON(n) and HALF(n) respectively,
for n a power of 2. Then the following relations hold:

$$N(n) \le 2H(n/2) + F(n/2) + c_1 n; \tag{1}$$

$$H(n) \le \begin{cases} c_2 n & \text{if } n = 1; \\ 2H(n/2) + 2F(n/2) + c_2 n & \text{if } n \ge 2. \end{cases} \tag{2}$$

The terms $c_1 n$, $c_2 n$, where $c_1$ and $c_2$ are constants, bound the times required
for additions and overhead. In the next section we suggest a Turing machine
implementation in which these overheads are indeed bounded by $O(n)$.

The relation (2) solves as:

$$H(n) \le \sum_{i=1}^{\log n} 2^i \cdot F(n/2^i) + c_2 n(1 + \log n) = O(F(n) \log n) \tag{3}$$

since $n \le F(n) \le F(2n)/2$ for all n a power of 2. (We assume the summation
is zero when the upper limit log n = 0.) From (1) and (3), we immediately
get $N(n) = O(F(n) \log n)$.

Let T(n) be the time when the $n$-th output digit is produced by ONLINE
for arbitrary n, and let $r = \lceil \log n \rceil$. Then

$$T(n) \le \sum_{i=1}^{r} N(2^i) + c_3 n \le c_3' \, r F(2^r) + c_3 n = O(F(n) \log n)$$

for constants $c_3$ and $c_3'$ by the assumptions on F.
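
The way relation (2) unrolls into (3) can be checked numerically. The Python sketch below takes (2) with equality and compares it with the closed form in (3). The cost function F used here is a hypothetical example chosen only because it satisfies $n \le F(n) \le F(2n)/2$; it is not the running time of any particular algorithm, and the names H, closed_form and c2 are illustrative.

```python
def F(n):
    """Hypothetical off-line cost n * (1 + log2 n) for n a power of 2.
    It satisfies n <= F(n) <= F(2n)/2, as Theorem 1 requires."""
    return n * n.bit_length()

def H(n, c2=1):
    """Relation (2) taken with equality, n a power of 2."""
    if n == 1:
        return c2 * n
    return 2 * H(n // 2, c2) + 2 * F(n // 2) + c2 * n

def closed_form(n, c2=1):
    """The sum in (3): sum of 2^i F(n/2^i) plus c2 * n * (1 + log n)."""
    r = n.bit_length() - 1                       # log2 n
    return sum(2**i * F(n // 2**i) for i in range(1, r + 1)) + c2 * n * (1 + r)

for j in range(11):
    n = 2**j
    assert H(n) == closed_form(n)
    # Each term 2^i F(n/2^i) is at most F(n), giving the O(F(n) log n) bound.
    assert closed_form(n) <= F(n) * j + n * (1 + j)
```

The second assertion reflects the step used in the text: since $F(2m) \ge 2F(m)$, every one of the log n terms of the sum is bounded by F(n).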



2.2. Detailed Description.

The descriptions of the procedures ONLINE, ON, HALF and OFF in Section 
2.1 leave somewhat vague both the order in which the computations are to be 
performed and the handling of the arguments to the functions. In this 
section, we define the routines more precisely to make clear that they are 
correct and that the on-line restriction is not violated. The implementation 
of these procedures on a multitape Turing machine to achieve the time bounds 
of Section 2.1 is straightforward.

The main procedure, ONLINE, takes two on-line arguments A and B and
produces the product AB on-line. The inputs and outputs may be thought of
as streams, that is, they are read and written sequentially, low-order
digits first. READ(X) reads and returns the next digit of the on-line
argument $X \in \{A, B\}$; PRINT(s) causes s to be produced as the next digit of
output, $s \in \{0, 1\}$.

The other procedures take three kinds of arguments. An on-line 
argument just names an input stream. An off-line argument is
a binary string. A procedural argument is another procedure which may be
passed to a subprocedure or executed directly by the called procedure. In 
the descriptions that follow, A and B denote on-line arguments, P denotes 
a procedural argument, and lower case letters are used for off-line arguments 
and variables. 

We first present the procedures and then prove them correct. The 
assertions of Lemma 3 describe exactly what ON and HALF are supposed to do, 
and Lemma 4 describes the effect of ONLINE. 

OFF(x, y) is the assumed off-line multiplication procedure with running
time F(n) of Theorem 1. SPLIT(x, n) is a function which returns a pair of
natural numbers $(x_1, x_0)$ such that $x_1 \cdot 2^n + x_0 = x$ and $x_0 < 2^n$. CAT($x_1$, $x_0$, n)
is the inverse of SPLIT and returns $x_1 \cdot 2^n + x_0$. (When $x_0$ is a length n
number, this just performs concatenation of binary strings.) Since our
numbers are represented in binary notation, it should be clear that SPLIT
and CAT both run in time proportional to the length of x.






To ON(n, d, a_0, b_0, A, B):

1. c ← 0;

2. Compute in parallel, starting with (i) and switching alternately
between (i) and (ii) every time a READ instruction is about to be
executed:

(i) (p, a_1) ← HALF(n/2, d, b_0, A, (λx. t ← x));
(ii) (q, b_1) ← HALF(n/2, 0, a_0, B, (λx. (c, s) ← SPLIT(c + x + t, 1);
PRINT(s)));

3. d_2 ← c + p + q;

4. c_2 ← d_2 + OFF(a_1, b_1);

5. Return (c_2, a_1, b_1)



Step 2 of this algorithm requires a little explanation. Statements
2(i) and 2(ii) each have side-effects: 2(i) affects the value of the
variable t, and 2(ii) affects the value of the variable c and causes
printing. Also, both statements cause input to be read, 2(i) from stream
A and 2(ii) from stream B. To ensure that the on-line restriction is not
violated and that the correct results are produced (since 2(ii) depends on
2(i)), the statements must be executed in parallel, subject to the synchronization
constraint that the $j$-th execution of the procedural argument of
2(i) precedes the $j$-th execution of the procedural argument of 2(ii), which
in turn precedes the $(j+1)$-st READ from either input stream. The method of
execution proposed in Step 2 guarantees this sequencing, for each of 2(i)
and 2(ii) causes precisely one evaluation of its procedural argument
between successive calls on READ. (Cf. Lemma 3, part B.4.)

To HALF(n, d, b, A, P):

If n = 1, then

1. a ← READ(A);

2. (c, s) ← SPLIT(d + ab, 1);

3. Call P(s);

4. Return (c, a);
else

5. (d_1, d_0) ← SPLIT(d, n/2);

6. (b_1, b_0) ← SPLIT(b, n/2);

7. (p, a_0) ← HALF(n/2, d_0, b_0, A, P);

8. (d_2', d_1') ← SPLIT(p + d_1 + OFF(a_0, b_1), n/2);

9. (q, a_1) ← HALF(n/2, d_1', b_0, A, P);

10. c_2 ← q + d_2' + OFF(a_1, b_1);

11. a ← CAT(a_1, a_0, n/2);

12. Return (c_2, a)
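
As a rough illustration, HALF can be transcribed into Python almost line for line. In the sketch below, read and P are ordinary Python callables standing in for the input stream A and the procedural argument, and built-in multiplication stands in for OFF; these are assumptions of the sketch, not part of the Turing machine construction.

```python
def split(x, n):
    """SPLIT(x, n): (x1, x0) with x1 * 2**n + x0 == x and x0 < 2**n."""
    return x >> n, x & ((1 << n) - 1)

def half(n, d, b, read, P):
    """HALF(n, d, b, A, P): reads n digits of a via read(), passes the n
    low-order digits of d + a*b to P one at a time, returns (high part, a)."""
    if n == 1:
        a = read()                                  # step 1
        c, s = split(d + a * b, 1)                  # step 2
        P(s)                                        # step 3
        return c, a                                 # step 4
    h = n // 2
    d1, d0 = split(d, h)                            # step 5
    b1, b0 = split(b, h)                            # step 6
    p, a0 = half(h, d0, b0, read, P)                # step 7
    d2p, d1p = split(p + d1 + a0 * b1, h)           # step 8 (a0*b1 plays OFF)
    q, a1 = half(h, d1p, b0, read, P)               # step 9
    c2 = q + d2p + a1 * b1                          # step 10 (a1*b1 plays OFF)
    return c2, (a1 << h) + a0                       # steps 11-12 (CAT)

digits = []
stream = iter([1, 0, 1, 1])                         # a = 13, low-order first
z1, a = half(4, 5, 11, lambda: next(stream), digits.append)
z0 = sum(s << i for i, s in enumerate(digits))
assert z1 * 2**4 + z0 == 5 + a * 11                 # d + a*b = 148
```

Every call on P happens immediately after a READ, so each low-order digit is available before the next digit of a is consumed.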



Lemma 3. Let n be a power of 2.
A. Assume $n \ge 2$ and let d, $a_0$ and $b_0$ be numbers of length $\le n/2$. Suppose
$(c_2, a_1, b_1)$ = ON(n, d, $a_0$, $b_0$, A, B), and ON produces the m-digit number
$c_1$ as output. Then:

A.1. ON reads n/2 digits from A; these digits are $a_1$.

A.2. ON reads n/2 digits from B; these digits are $b_1$.

A.3. m = n/2.

A.4. ON obeys the on-line restriction.

A.5. $c_2 \cdot 2^{n/2} + c_1 = d + a_0 \cdot b_1 + a_1 \cdot b_0 + a_1 \cdot b_1 \cdot 2^{n/2}$.

B. Let d and b be numbers of length $\le n$. Suppose $(z_1, a)$ =
HALF(n, d, b, A, P), and HALF calls P r times with the successive digits
of $z_0$ as arguments. Then:

B.1. HALF reads n digits from A; these digits are a.

B.2. r = n.

B.3. HALF produces no output (except as a possible result of calls
on P).

B.4. For every j, $1 \le j \le n$, HALF calls P for the $j$-th time before
reading the $(j+1)$-st digit of A and after reading the $j$-th.

B.5. $z_1 \cdot 2^n + z_0 = d + ab$.

Proof. The proof is by induction on n.

For n = 1, part A is vacuously true. Part B follows by inspection of
the case n = 1 in the definition of HALF.

Now, let n be a power of 2, and suppose the statement of Lemma 3 is
true for every n' < n which is also a power of 2. Statements A.1 - A.3
follow from the definition of ON and from B.1 - B.3 for n' = n/2. A.4
follows from B.3 and B.4 for n' = n/2 and from the remarks following the
definition of ON about the sequencing in Step 2. A.5 follows by direct
calculation from B.5 for n' = n/2 and the following:

Claim: Let $p_0$ be the sequence of n/2 digits with which the procedural argument
of Step 2(i) of ON is successively called, regarded as a binary integer,
low-order digit first. Let $q_0$ be the corresponding sequence for 2(ii). Let
$c_1$ be the sequence of digits printed by Step 2 of ON, and let c be the value
of the variable by the same name at the completion of Step 2. Then
$c \cdot 2^{n/2} + c_1 = p_0 + q_0$. We leave the proof of this claim to the reader.

B.1 and B.2 follow easily by induction, and B.3 follows by inspection,
since HALF contains no PRINT statements. B.4 just says READs alternate with
calls on P. This is again obvious since the only READ occurs in Step 1 of
HALF and the only call on P is in Step 3. Finally, B.5 follows by induction
using B.5 for n' = n/2 and direct calculation. □

We now define the main on-line multiplication procedure, ONLINE.

To ONLINE(A, B):

1. a_0 ← READ(A); b_0 ← READ(B); PRINT(a_0·b_0);

2. n ← 2; d ← 0;

3. (c_2, a_1, b_1) ← ON(n, d, a_0, b_0, A, B);

4. (n, d, a_0, b_0) ← (2n, c_2, CAT(a_1, a_0, n/2), CAT(b_1, b_0, n/2));

5. Go to 3.
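
The procedures ONLINE, ON and HALF can be prototyped in Python to watch the sequencing at work. The sketch below is one possible rendering under several assumptions: HALF is written as a generator that yields just before every READ so that ON can switch between its two parallel computations; built-in multiplication stands in for OFF; inputs are padded with zeros; and the function stops after num_digits outputs instead of looping forever. All Python names are hypothetical.

```python
from itertools import chain, repeat

def split(x, n):
    return x >> n, x & ((1 << n) - 1)

def cat(x1, x0, n):
    return (x1 << n) + x0

def half(n, d, b, read, P):
    """HALF as a generator: pauses (yield) whenever a READ is about to occur."""
    if n == 1:
        yield
        a = read()
        c, s = split(d + a * b, 1)
        P(s)
        return c, a
    h = n // 2
    d1, d0 = split(d, h)
    b1, b0 = split(b, h)
    p, a0 = yield from half(h, d0, b0, read, P)
    d2p, d1p = split(p + d1 + a0 * b1, h)        # a0 * b1 plays the role of OFF
    q, a1 = yield from half(h, d1p, b0, read, P)
    return q + d2p + a1 * b1, cat(a1, a0, h)

def on(n, d, a0, b0, read_a, read_b, emit):
    """ON(n, d, a0, b0, A, B): emits n/2 digits and returns (c2, a1, b1)."""
    st = {"c": 0, "t": 0}
    def p_i(x):                                  # procedural argument of 2(i)
        st["t"] = x
    def p_ii(x):                                 # procedural argument of 2(ii)
        st["c"], s = split(st["c"] + x + st["t"], 1)
        emit(s)
    tasks = [half(n // 2, d, b0, read_a, p_i),
             half(n // 2, 0, a0, read_b, p_ii)]
    results = [None, None]
    k = 0                                        # start with (i), then alternate
    while results[0] is None or results[1] is None:
        if results[k] is None:
            try:
                next(tasks[k])                   # run until the next READ
            except StopIteration as stop:
                results[k] = stop.value
        k = 1 - k
    (p, a1), (q, b1) = results
    d2 = st["c"] + p + q
    return d2 + a1 * b1, a1, b1                  # a1 * b1 plays the role of OFF

def online_multiply(a_bits, b_bits, num_digits):
    """Return (digits, log); log[j] = digits read from (A, B) when digit j
    was printed, which lets the on-line restriction be inspected."""
    ia, ib = chain(a_bits, repeat(0)), chain(b_bits, repeat(0))
    reads = {"A": 0, "B": 0}
    out, log = [], []
    def read_a():
        reads["A"] += 1
        return next(ia)
    def read_b():
        reads["B"] += 1
        return next(ib)
    def emit(s):
        out.append(s)
        log.append((reads["A"], reads["B"]))
    a0, b0 = read_a(), read_b()                  # step 1
    emit(a0 * b0)
    n, d = 2, 0                                  # step 2
    while len(out) < num_digits:                 # steps 3-5
        c2, a1, b1 = on(n, d, a0, b0, read_a, read_b, emit)
        n, d, a0, b0 = 2 * n, c2, cat(a1, a0, n // 2), cat(b1, b0, n // 2)
    return out[:num_digits], log[:num_digits]
```

For example, online_multiply([1, 1], [1, 1], 4) yields the digits [1, 0, 0, 1] of 3 × 3 = 9, and the accompanying log shows that digit j was printed when exactly j + 1 digits of each input had been read.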

Lemma 4. When Step 3 is about to be executed for the $j$-th time, let
c be the m-digit number which has been printed so far. Then:

1. $n = 2^j$.

2. m = n/2.

3. Exactly n/2 digits have been read from each of A and B. These
digits are $a_0$ and $b_0$ respectively.

4. The on-line restriction has not yet been violated.

5. $a_0 \cdot b_0 = d \cdot 2^{n/2} + c$.

Proof. The proof is by induction on j. By inspection of the program
ONLINE, the lemma holds for j = 1. The fact that it holds for j > 1 follows
readily by direct calculation using the truth of the lemma for j - 1 and
Lemma 3, part A. □

A few remarks are in order concerning the Turing machine implementation
of these procedures. We may assume that the Turing machine has one work tape 
for each variable named in any of the programs. The recursive procedure 
calls are handled in the usual way using one additional tape as a pushdown 
store. At procedure entry, the old values of any local variables are saved 
on the pushdown store and the local variables are initialized; at procedure 
exit, the old values of the variables are restored. 

The only possible difficulty is with the implementation of Step 2 of 
ON. However, ON itself is not recursive, so we simply make two independent 
copies of the Turing machine which implements HALF and run them in parallel 
as required. 

3. Sparse Numbers 

Call a natural number k-sparse if its binary representation contains 
at most k one-bits. The purpose of this section is to show that the time 
bound of Corollary 2 for on-line multiplication can be improved if either 
of the two multiplicands is sufficiently sparse. In particular, when one 
of the numbers is (log n)-sparse, the time reduces to just $O(n \log n)$.
Paterson, M. Fischer and Meyer [7] show that any Turing machine M for
on-line multiplication of a length n number by the particular (log n)-sparse
constant

$$K_n = \sum_{2^i < n} 2^{2^i}$$

requires time $cn \log n$, where c depends only on M. Thus, both bounds are
optimal to within constant factors.

Let ones(b) denote the number of ones in the binary representation
of b. We begin by describing an off-line procedure, OFFSPARSE(a, b), which
forms the product of two length n numbers a and b by repeatedly shifting
and adding b into an accumulator. More precisely, let $(a_{n-1} \ldots a_0)_2$ be a
binary representation of a, and let $I = \{i \mid a_i = 1\}$. Then $a = \sum_{i \in I} 2^i$,
so $ab = \sum_{i \in I} b \cdot 2^i$. The last summation has $|I| = \mathrm{ones}(a)$ terms, each of
length at most 2n, so it can be computed on a Turing machine in time at
most $cn(\mathrm{ones}(a) + 1)$ for some constant $c > 0$.
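
The shift-and-add idea behind OFFSPARSE is easy to sketch in Python. The function below also counts the additions it performs, which equals ones(a); the function name and the returned counter are illustrative additions, and Python integers stand in for the Turing machine tapes.

```python
def offsparse(a, b):
    """Shift-and-add product of a and b.
    Returns (a * b, number of additions), where the count equals ones(a)."""
    acc, additions, i = 0, 0, 0
    while a >> i:                  # scan the bits of a, low-order first
        if (a >> i) & 1:
            acc += b << i          # add b shifted to position i
            additions += 1
        i += 1
    return acc, additions

print(offsparse(0b10010, 13))      # (234, 2): two one-bits, two additions
```

Each addition handles a number of length at most 2n, so the total work grows with n times the number of one-bits of a, in line with the bound stated above.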

OFFSPARSE(a, b) always computes the correct answer but is fast 
(relative to other multiplication algorithms) only when a is sparse. 
Similarly, OFFSPARSE(b, a) is fast only when b is sparse. Both of these
methods take time $O(n^2)$ in the worst case. By running them in parallel with
any other (fast) multiplication procedure, we obtain a method whose running 
time is proportional to the minimum of the times for any of the three 
procedures, yielding: 

Lemma 5. Let OFF be a multitape Turing machine which performs off-line
integer multiplication of length n numbers within time F(n), where F satisfies
the conditions of Theorem 1. Let G(n) = F(n)/n. Then there is another
multitape Turing machine OFFSP which performs off-line multiplication of two
length n numbers a and b in time at most

$$c_1 n (\min\{\mathrm{ones}(a), \mathrm{ones}(b), G(n)\} + 1),$$

where $c_1 > 0$ is a constant.

Theorem 6. Let OFF, F(n) and G(n) be as in Lemma 5. Then there is a
multitape Turing machine ONLINESP which performs integer multiplication,
obeys the on-line restriction, and produces the $n$-th output digit ($n \ge 2$) in

$$c\,(n \min\{\mathrm{ones}(a(n)), \mathrm{ones}(b(n)), (\log n) G(n)\} + n \log n)$$

computational steps, where a(n) and b(n) are the low-order n digits of the
arguments, and $c > 0$ is a constant.

Proof. The new on-line procedure is nearly identical to the one
described in Section 2. The only difference lies in the operation of the
off-line subprocedure.

Let OFFSP be the off-line multiplication procedure of Lemma 5. Let
ONSP, HALFSP, and ONLINESP be procedures defined exactly as ON, HALF, and
ONLINE are in Section 2, except that OFFSP is used as the off-line
multiplication subprocedure instead of OFF. Let NS(n, p, q), HS(n, p, q),
and FS(n, p, q) be the maximum number of steps required by ONSP(n), HALFSP(n),
and OFFSP respectively when multiplying n-digit integers a and b with
$p \ge \mathrm{ones}(a)$, $q \ge \mathrm{ones}(b)$, and n equal to a power of 2. (Note that
$FS(n, p, q) \le c_1 n (\min\{p, q, G(n)\} + 1)$ by Lemma 5.) For definiteness,
assume a is the on-line argument for the half-line computation. For
arbitrary n, let TS(n, p, q) be the time when the $n$-th output digit is
produced by ONLINESP, assuming that the first n digits of the two inputs
have p and q ones respectively.

Split the n-digit integers a and b as $a = a_1 \cdot 2^{n/2} + a_0$ and $b = b_1 \cdot 2^{n/2} + b_0$,
$a_0, b_0 < 2^{n/2}$, and let $p_i = \mathrm{ones}(a_i)$ and $q_i = \mathrm{ones}(b_i)$ for i = 0, 1. By
the recursive definitions of ONSP(n), HALFSP(n) and ONLINESP, and
maximizing over all such n-digit integers a and b, the following
relations hold.

$$NS(n, p, q) \le \max\{HS(n/2, p_1, q_0) + HS(n/2, q_1, p_0) + FS(n/2, p_1, q_1) + c_2 n \mid p_0 + p_1 = p,\ q_0 + q_1 = q\}; \tag{4}$$

$$HS(n, p, q) \le \max\{HS(n/2, p_0, q_0) + FS(n/2, p_0, q_1) + HS(n/2, p_1, q_0) + FS(n/2, p_1, q_1) + c_3 n \mid p_0 + p_1 = p,\ q_0 + q_1 = q\}; \tag{5}$$

$$TS(n, p, q) \le \sum_{i=1}^{\lceil \log n \rceil} NS(2^i, p, q) + O(n). \tag{6}$$



Lemma 7. There is a constant $c_4$ such that

$$HS(n, p, q) \le c_1 n \min\{p, q, (\log n) \cdot G(n)\} + c_4 n \log n$$

for all $n \ge 2$ with n equal to a power of 2 and all $p, q \ge 0$.

Proof. Proof is by induction on j, where $n = 2^j$. The base j = 1 is
immediate by choosing $c_4$ large enough.

The induction step follows from Lemma 5 and relations (4) and (5) by
a straightforward calculation. Choose $c_4 \ge c_1 + c_3$. Assume the bound of
Lemma 7 holds for HS(n/2, p, q) where $p, q \ge 0$ are arbitrary. Then

$$\begin{aligned}
HS(n, p, q) \le \max\{\, & c_1 (n/2) \min\{p_0, q_0, \log(n/2) \cdot G(n/2)\} + c_4 (n/2) \log(n/2) \\
& + c_1 (n/2)(\min\{p_0, q_1, G(n/2)\} + 1) \\
& + c_1 (n/2) \min\{p_1, q_0, \log(n/2) \cdot G(n/2)\} + c_4 (n/2) \log(n/2) \\
& + c_1 (n/2)(\min\{p_1, q_1, G(n/2)\} + 1) \\
& + c_3 n \mid p_0 + p_1 = p,\ q_0 + q_1 = q \,\}.
\end{aligned}$$

Note that by distributivity of + over min, it follows that for all integers
$p_0, p_1, q_0, q_1, z_0, z_1$,

$$\tfrac{1}{2}(\min\{p_0, q_0, z_0\} + \min\{p_0, q_1, z_1\} + \min\{p_1, q_0, z_0\} + \min\{p_1, q_1, z_1\}) \le \min\{p_0 + p_1,\ q_0 + q_1,\ z_0 + z_1\}.$$

Therefore,

$$HS(n, p, q) \le c_1 n \min\{p, q, (\log(n/2) + 1) \cdot G(n/2)\} + c_4 n \log n + (c_1 + c_3 - c_4) n.$$

Since $F(n) \ge 2F(n/2)$, then $G(n) \ge G(n/2)$, so the induction step is proved. □
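
The inequality on minima used in the induction step can be sanity-checked by brute force over small values. The Python sketch below is only a numerical check on a finite range, not a proof; the helper names lhs and rhs are hypothetical.

```python
from itertools import product

def lhs(p0, p1, q0, q1, z0, z1):
    """Half the sum of the four minima appearing in the induction step."""
    return (min(p0, q0, z0) + min(p0, q1, z1)
            + min(p1, q0, z0) + min(p1, q1, z1)) / 2

def rhs(p0, p1, q0, q1, z0, z1):
    return min(p0 + p1, q0 + q1, z0 + z1)

# Exhaustive check over all 4**6 = 4096 tuples with entries in 0..3.
violations = [v for v in product(range(4), repeat=6) if lhs(*v) > rhs(*v)]
print(len(violations))   # 0
```

The check succeeds for a simple reason: bounding each minimum by its p-entry gives 2(p_0 + p_1), by its q-entry gives 2(q_0 + q_1), and by its z-entry gives 2(z_0 + z_1).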

It is now easy to verify that there are constants c and c' such that

$$NS(n, p, q) \le c'(n \min\{p, q, (\log n) \cdot G(n)\} + n \log n)$$

for all $n \ge 2$ and n equal to a power of 2; and

$$TS(n, p, q) \le c(n \min\{p, q, (\log n) \cdot G(n)\} + n \log n)$$

for all $n \ge 2$. By the discussion of Section 2.2, it is clear that this
on-line procedure can be implemented on a Turing machine whose running time
satisfies conditions (4) - (6). This completes the proof of Theorem 6. □

Corollary 8. There is a multitape Turing machine which performs
on-line integer multiplication within time

$$c\,(n \min\{\mathrm{ones}(a(n)), \mathrm{ones}(b(n)), (\log n)^2 \log\log n\} + n \log n)$$

where a(n) and b(n) are the low-order n bits of the two inputs.

An interesting open question is whether or not the n log n overhead 
term can be eliminated from Theorem 6. 



4. Other Applications 

4.1. Generalized Linear Products. 

The off-line to on-line conversion can be applied to other computations 
which can be loosely described as convolutional in nature. The generalized 
linear products defined in [3] are one such class. Let $a = (a_0, a_1, \ldots, a_m)$
and $b = (b_0, b_1, \ldots, b_n)$ be two vectors. The linear product with respect
to $\otimes$ and $\oplus$, written here as $a \circledast b$, is a vector $c = (c_0, c_1, \ldots, c_{m+n})$, where

$$c_k = \bigoplus_{i+j=k} a_i \otimes b_j,$$

$k = 0, \ldots, m+n$. For this to be meaningful, $a_i, b_j \in D$, $c_k \in E$ for some
sets D and E, and $\otimes$ and $\oplus$ are functions,

$$\otimes : D \times D \to E,$$

$$\oplus : E \times E \to E, \quad \oplus \text{ associative}.$$
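
A generalized linear product is straightforward to express in code once the two operations are passed in as parameters. The Python sketch below is a direct, quadratic-time rendering of the definition; the function name linear_product and the parameter names times and plus are hypothetical.

```python
from functools import reduce
from operator import add, mul, and_, or_

def linear_product(a, b, times, plus):
    """Return c with c[k] = plus-combination of times(a[i], b[j]) over i + j = k.
    times plays the role of the product operation, plus the associative sum."""
    m, n = len(a) - 1, len(b) - 1
    c = []
    for k in range(m + n + 1):
        terms = [times(a[i], b[k - i])
                 for i in range(max(0, k - n), min(m, k) + 1)]
        c.append(reduce(plus, terms))
    return c

# Ordinary polynomial multiplication: (1 + 2x)(3 + x) = 3 + 7x + 2x^2.
print(linear_product([1, 2], [3, 1], mul, add))                     # [3, 7, 2]
# A Boolean product with AND as the product and OR as the sum.
print(linear_product([True, False], [False, True], and_, or_))      # [False, True, False]
```

Only associativity of the sum is used, since reduce combines the terms of each c[k] from left to right.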






The only property needed by the on-line conversion is that the product
of two length n vectors A and B be obtainable by additions ($\oplus$) from the
four products of $A_0$ with $B_0$, $A_0$ with $B_1$, $A_1$ with $B_0$, and $A_1$ with $B_1$, where
$A_0 = (a_0, \ldots, a_{n/2-1})$, $A_1 = (a_{n/2}, \ldots, a_{n-1})$, $B_0 = (b_0, \ldots, b_{n/2-1})$, and
$B_1 = (b_{n/2}, \ldots, b_{n-1})$.

Generalized linear products clearly have this property, and in fact the 
on-line conversion is even simpler than for integer multiplication since 
there are no carries. 
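
For the special case of ordinary polynomial multiplication (sum +, product ×), the four-product property can be illustrated in Python as follows. The sketch assumes vectors of even length n and uses a plain quadratic product for the half-size pieces; the function names poly_mul and combine_halves are hypothetical.

```python
def poly_mul(a, b):
    """Coefficient vector of the product of two polynomials."""
    c = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            c[i + j] += x * y
    return c

def combine_halves(A, B):
    """Rebuild the product of A and B (even length n) from the four
    half-size products, using only shifted coefficient-wise additions."""
    n = len(A)
    h = n // 2
    A0, A1, B0, B1 = A[:h], A[h:], B[:h], B[h:]
    c = [0] * (2 * n - 1)
    for shift, part in ((0, poly_mul(A0, B0)), (h, poly_mul(A0, B1)),
                        (h, poly_mul(A1, B0)), (n, poly_mul(A1, B1))):
        for k, v in enumerate(part):
            c[shift + k] += v          # no carries between positions
    return c

assert combine_halves([1, 2, 3, 4], [5, 6, 7, 8]) == poly_mul([1, 2, 3, 4], [5, 6, 7, 8])
```

Unlike the integer case, each output position is a plain sum of contributions, so no carry has to be passed from one block to the next.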

The result thus obtained is that a generalized linear product of two 
length n vectors can be computed on-line with at most $O(\mathrm{OP}(n) \log n)$ uses of
the basic operations $\otimes$ and $\oplus$, where OP(n) bounds the number of basic operations
needed to compute off-line the linear product of two length n vectors,
providing that OP is monotone and satisfies $n \le \mathrm{OP}(n) \le \mathrm{OP}(2n)/2 \le k\,\mathrm{OP}(n)$
for some constant k.

Polynomial multiplication is an example of a linear product. 
Since polynomials of degree n with complex coefficients can be multiplied 
off-line using the Fast Fourier Transform [2] with only $O(n \log n)$ complex
additions and multiplications, the corresponding on-line problem can be
done with at most $O(n (\log n)^2)$ such scalar operations.

In case D and E are finite, the basic operations $\otimes$ and $\oplus$ can be computed
in a constant amount of time on a Turing machine, and Theorem 1 then applies. 
A straightforward application is to the problem of on-line multiplication of 
polynomials with coefficients from some finite field. 

Another application yields an $O(n \log n)$ method for on-line recognition
of palindromes [3]. The on-line conversion is applied to a subroutine for
performing the linear product off-line in time $O(n)$ which in turn uses
as a subroutine the pattern matching algorithm of Morris and Pratt [6].
Details can be found in [3].

4.2. Multiplication by a Constant.

One further problem we consider is that of on-line multiplication by a 
constant. Viewed as a function of two arguments, this is an example of a 
half-line computation. All the digits of the constant multiplier are 
available off-line. The digits of the multiplicand are read subject to 
the on-line restriction. The product can clearly be computed in time proportional
to n, the length of the on-line input, where the constant of
proportionality may depend on the length of the off-line input. Our 
methods give a constant of proportionality smaller than would be obtained 
using classical methods. 

Let y be a k-bit constant (the off-line argument), and let x be an
n-bit multiplicand (the on-line argument). We divide the bits of x into
k-bit blocks and consider each block to represent a single digit of the
base $b = 2^k$ representation of x. y can be regarded as a single digit in
base b, and we form the product xy in the straightforward way by multiplying
the single digit y by the successive base b digits of x, and adding in the
carries.
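
The block method can be sketched in Python as follows. The sketch works at the level of base-b digits only: it consumes one k-bit block of x at a time and emits one base-b digit of the product per block, using built-in arithmetic for the single-digit product. It does not attempt the bit-level on-line computation of each block; the function name multiply_by_constant is hypothetical.

```python
def multiply_by_constant(x_bits, y, k):
    """Yield the base-2**k digits of x * y, low-order first.
    x_bits: binary digits of x, low-order first; y: a k-bit constant."""
    b = 1 << k
    carry = 0
    for start in range(0, len(x_bits), k):
        block = x_bits[start:start + k]
        u = sum(bit << i for i, bit in enumerate(block))   # one base-b digit of x
        carry, digit = divmod(y * u + carry, b)            # single-digit product
        yield digit
    while carry:                                           # flush the final carry
        carry, digit = divmod(carry, b)
        yield digit

x, y, k = 1234567, 0b1011, 4
x_bits = [(x >> i) & 1 for i in range(x.bit_length())]
blocks = list(multiply_by_constant(x_bits, y, k))
assert sum(d << (k * i) for i, d in enumerate(blocks)) == x * y
```

Since y and each block are below b and the incoming carry is below b, the value y·u + carry stays below b², so the outgoing carry is again a single base-b digit.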

The above method is on-line with respect to base b digits, for the
$i$-th block of k bits (the $i$-th digit base b) is produced before any bits of
the $(i+1)$-st block are read. To make the method on-line with respect to the
binary numbers, we need to specify how to compute the values u + v and
yu on-line, where u and v are arbitrary k-bit numbers. The former may be
computed by the ordinary method for addition and requires time $O(k)$. To
do the latter on a Turing machine, even when we are allowed to do arbitrary
preprocessing on y and k, we know of no way better than to use the on-line
algorithm of Corollary 2, which requires time $O(k (\log k)^2 \log\log k)$. This
has the property that when n < k, the time drops to $O(n (\log n)^2 \log\log n)$.
x contains $\lceil n/k \rceil$ blocks of at most k bits each, so we have proved:
Theorem 9. There exists a Turing machine M and a constant c such that
for each $k \ge 4$ and every n, if x is a number of length n and y is a number
of length k, then M computes the product xy on-line with respect to x and runs in
time $\le cn (\log k)^2 \log\log k$.






The authors 
helpful discussions

Figure 1. Recursively computed multiplication.